I Don't Know: Explicit Modeling of Uncertainty with an [IDK] Token

Neural Information Processing Systems

Large Language Models are known to capture real-world knowledge, allowing them to excel in many downstream tasks. Despite recent advances, these models are still prone to what are commonly known as hallucinations, causing them to emit unwanted and factually incorrect text. In this work, we propose a novel calibration method that can be used to combat hallucinations. We add a special [IDK] ("I Don't Know") token to the model's vocabulary and introduce an objective function that shifts probability mass to the [IDK] token for incorrect predictions. This approach allows the model to express uncertainty in its output explicitly. We evaluate our proposed method across multiple model architectures and factual downstream tasks. We find that models trained with our method are able to express uncertainty in places where they would previously make mistakes while suffering only a small loss of encoded knowledge. We further perform extensive ablation studies of multiple variations of our approach and provide a detailed analysis of the precision-recall tradeoff of our method.
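The mechanism the abstract describes, a reserved [IDK] token plus an objective that moves target probability mass onto it when the model's prediction is wrong, can be sketched as follows. This is a minimal NumPy illustration of one plausible reading of that objective, not the paper's implementation; the `IDK` index and `idk_weight` value are assumptions made for the example.

```python
import numpy as np

IDK = 0  # index reserved for the special [IDK] token (illustrative choice)

def softmax(z):
    z = z - z.max()  # numerically stable softmax
    e = np.exp(z)
    return e / e.sum()

def idk_cross_entropy(logits, gold, idk_weight=0.5):
    """Sketch of an [IDK]-style objective: if the model's top prediction
    is wrong, move part of the target probability mass from the gold
    token to the [IDK] token; otherwise use plain one-hot cross-entropy."""
    probs = softmax(logits)
    target = np.zeros_like(probs)
    if probs.argmax() == gold:
        target[gold] = 1.0              # correct: ordinary cross-entropy
    else:
        target[gold] = 1.0 - idk_weight
        target[IDK] = idk_weight        # wrong: reward saying "I don't know"
    return float(-(target * np.log(probs + 1e-12)).sum())
```

Under this reading, a confidently wrong model is pushed toward emitting [IDK], while correct predictions are trained exactly as in standard language modeling.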


UNCLE: Benchmarking Uncertainty Expressions in Long-Form Generation

Yang, Ruihan, Zhang, Caiqi, Zhang, Zhisong, Huang, Xinting, Yu, Dong, Collier, Nigel, Yang, Deqing

arXiv.org Artificial Intelligence

Large Language Models (LLMs) are prone to hallucination, particularly in long-form generations. A promising direction to mitigate hallucination is to teach LLMs to express uncertainty explicitly when they lack sufficient knowledge. However, existing work lacks direct and fair evaluation of LLMs' ability to express uncertainty effectively in long-form generation. To address this gap, we first introduce UNCLE, a benchmark designed to evaluate uncertainty expression in both long- and short-form question answering (QA). UNCLE covers five domains and includes more than 1,000 entities, each with paired short- and long-form QA items. Our dataset is the first to directly link short- and long-form QA through aligned questions and gold-standard answers. Along with UNCLE, we propose a suite of new metrics to assess the models' capabilities to selectively express uncertainty. We then demonstrate that current models fail to convey uncertainty appropriately in long-form generation. We further explore both prompt-based and training-based methods to improve models' performance, with the training-based methods yielding greater gains. Further analysis of alignment gaps between short- and long-form uncertainty expression highlights promising directions for future research using UNCLE.



LoGU: Long-form Generation with Uncertainty Expressions

Yang, Ruihan, Zhang, Caiqi, Zhang, Zhisong, Huang, Xinting, Yang, Sen, Collier, Nigel, Yu, Dong, Yang, Deqing

arXiv.org Artificial Intelligence

While Large Language Models (LLMs) demonstrate impressive capabilities, they still produce factually incorrect content (i.e., hallucinations). A promising approach to mitigate this issue is enabling models to express uncertainty when unsure. Previous research on uncertainty modeling has primarily focused on short-form QA, but real-world applications often require much longer responses. In this work, we introduce the task of Long-form Generation with Uncertainty (LoGU). We identify two key challenges: Uncertainty Suppression, where models hesitate to express uncertainty, and Uncertainty Misalignment, where models convey uncertainty inaccurately. To tackle these challenges, we propose a refinement-based data collection framework and a two-stage training pipeline. Our framework adopts a divide-and-conquer strategy, refining uncertainty based on atomic claims. The collected data are then used in training through supervised fine-tuning (SFT) and direct preference optimization (DPO) to enhance uncertainty expression. Extensive experiments on three long-form instruction following datasets show that our method significantly improves accuracy, reduces hallucinations, and maintains the comprehensiveness of responses.
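The atomic-claim refinement step described above can be pictured with a toy sketch: split a draft answer into claims, keep supported ones verbatim, and rewrite unsupported ones as explicit uncertainty expressions. The upstream claim splitter and fact verifier are assumed to exist, and the hedging template is invented for illustration; this is not the authors' pipeline.

```python
def refine_with_uncertainty(claims, supported):
    """Toy refinement step: keep supported atomic claims verbatim and
    rewrite unsupported ones as explicit uncertainty expressions.
    `claims` is a list of sentences; `supported` holds matching booleans
    produced by some upstream verifier (assumed here)."""
    refined = []
    for claim, ok in zip(claims, supported):
        if ok:
            refined.append(claim)
        else:
            # invented hedging template; a real system would rewrite more fluently
            refined.append(f"I am not sure, but it may be that {claim[0].lower()}{claim[1:]}")
    return " ".join(refined)
```

Pairs of (original draft, refined draft) of this shape could then serve as the kind of SFT targets or DPO preference pairs the abstract mentions.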


Prudent Silence or Foolish Babble? Examining Large Language Models' Responses to the Unknown

Liu, Genglin, Wang, Xingyao, Yuan, Lifan, Chen, Yangyi, Peng, Hao

arXiv.org Artificial Intelligence

Large Language Models (LLMs) often struggle when faced with situations where they lack the prerequisite knowledge to generate a sensible response. In these cases, models tend to fabricate and hallucinate, rather than appropriately signaling uncertainty as humans would. This behavior misaligns with human conversational norms and presents challenges surrounding responsible and ethical AI development. This work aims to systematically investigate LLMs' behaviors in such situations. We curate an adversarial question-answering benchmark containing unanswerable questions targeting information absent from the LLM's training data. Concretely, these unanswerable questions contain non-existent concepts or false premises. When presented with such unanswerable questions, an LLM should appropriately convey uncertainty, and be able to challenge the premise and refuse to generate a response. When facing answerable valid questions, a model should demonstrate a positive correlation between accuracy and confidence. Using a model-agnostic unified confidence elicitation approach, we observe that LLMs that have gone through instruction finetuning and reinforcement learning from human feedback (RLHF) perform significantly better than their counterparts that do not. Moreover, uncertainty expression through our elicitation method does not always stay consistent with the perceived confidence of the direct response of an LLM. Our findings call for further research into teaching LLMs to proactively and reliably express uncertainty.


Researchers help AI express uncertainty to improve health monitoring tech

#artificialintelligence

A team of engineering and health researchers has developed a tool that improves the ability of electronic devices to detect when a human patient is coughing, which has applications in health monitoring. The new tool relies on an advanced artificial intelligence (AI) algorithm that helps the AI better identify uncertainty when faced with unexpected data in real-world situations. The paper, "Robust Cough Detection with Out-of-Distribution Detection," is published in the IEEE Journal of Biomedical and Health Informatics. "When AI is being trained to identify the sound of coughing, this is usually done with 'clean' data--there is not a lot of background noise or confusing sounds," says Edgar Lobaton, corresponding author of a paper on the work and an associate professor of electrical and computer engineering at North Carolina State University. "But the real world is full of background noise and confusing sounds. So previous cough detection technologies often struggled with 'false positives'--they would say that someone was coughing even if nobody was coughing. We've developed an algorithm that helps us address this problem by allowing an AI to express uncertainty."


Understanding the Uncertainty Loop of Human-Robot Interaction

Leusmann, Jan, Wang, Chao, Gienger, Michael, Schmidt, Albrecht, Mayer, Sven

arXiv.org Artificial Intelligence

Recently, the field of Human-Robot Interaction has gained popularity due to the wide range of ways robots can support humans in daily tasks. One form of supportive robot is the socially assistive robot, which is built specifically for communicating with humans, e.g., as a service robot or personal companion. Because these robots understand humans through artificial intelligence, they will at some point make wrong assumptions about the human's current state and give an unexpected response. In human-human conversations, unexpected responses happen frequently. However, it is currently unclear how such robots should act once they recognize that the human did not expect their response, or how they might signal the uncertainty of a response in the first place. To this end, we explore the different forms of potential uncertainty during human-robot conversations and how humanoids can communicate these uncertainties through verbal and non-verbal cues.